On the availability page, you can see two calculated values called MTBF (mean time between failures) and MTTR (mean time to recovery).
Here's what they mean:
Note that RHQ supports two other availability types in addition to UP and DOWN:
UNKNOWN: No data has been collected and thus it is not known if the resource is up or down.
DISABLED: This can be considered "administratively down". Someone has marked the resource as currently disabled, usually to indicate that a known maintenance period has started or is about to start, so if the resource goes down it should not be counted as a failure. In other words, the resource is expected to be down.
For the purposes of MTBF and MTTR calculations, any period of DISABLED availability is not considered DOWN, since a user has said this DISABLED period is expected. UNKNOWN periods are likewise ignored, since the true state of the resource cannot be determined.
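As a rough illustration of the rule above, the following Python sketch computes both values from a list of availability periods. It assumes that MTBF is the total UP time divided by the number of failures and that MTTR is the total DOWN time divided by the number of failures. The function name, the (type, duration) tuple layout, and the choice to count each DOWN period as one failure are assumptions made for this example; this is not RHQ's actual implementation.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (availability type, duration in seconds).
Period = Tuple[str, float]


def mtbf_mttr(periods: List[Period]) -> Tuple[Optional[float], Optional[float]]:
    """Return (MTBF, MTTR) in seconds, or (None, None) if there were no failures.

    DISABLED and UNKNOWN periods contribute to neither the UP total nor the
    DOWN total, so they are effectively ignored.
    """
    up_time = sum(duration for state, duration in periods if state == "UP")
    down_time = sum(duration for state, duration in periods if state == "DOWN")
    # Assumption: consecutive periods differ in type, so every DOWN period
    # corresponds to exactly one transition to DOWN (one failure).
    failures = sum(1 for state, _ in periods if state == "DOWN")
    if failures == 0:
        return None, None
    return up_time / failures, down_time / failures


history = [
    ("UP", 3600.0),
    ("DOWN", 600.0),
    ("UP", 7200.0),
    ("DISABLED", 1800.0),  # expected maintenance, not a failure
    ("UNKNOWN", 900.0),    # no data, ignored
    ("DOWN", 1200.0),
]
mtbf, mttr = mtbf_mttr(history)  # mtbf == 5400.0, mttr == 900.0
```

In this example the resource was UP for 10800 seconds and DOWN for 1800 seconds across two failures, so the sketch yields an MTBF of 5400 seconds and an MTTR of 900 seconds. The DISABLED and UNKNOWN periods change neither value.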
The availability summary page in the GWT UI shows the following statistics for the resource:
Availability: percentage of time the resource has been UP
Uptime: duration of time the resource has been UP
Down: percentage of time the resource has been DOWN
Downtime: duration of time the resource has been DOWN
Disabled: percentage of time the resource has been DISABLED
Disabled Time: duration of time the resource has been DISABLED
Number of Failures: number of times the resource has transitioned to DOWN
Number of Times Disabled: number of times the resource was DISABLED
MTBF: in short, the average time the resource was UP (see above for more details)
MTTR: in short, the average time the resource was DOWN (see above for more details)
Note that the percentages do NOT include the times when the resource was in an UNKNOWN state. Since the state of the resource cannot be determined for those periods, they are ignored in the percentage calculations.
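To make the treatment of UNKNOWN time concrete, here is a minimal Python sketch of one way the percentages could be derived. It assumes each percentage is taken relative to the total time spent in UP, DOWN, and DISABLED states, with UNKNOWN time left out of the denominator. The function name and the dictionary layout are hypothetical and are not taken from RHQ.

```python
from typing import Dict


def availability_percentages(durations: Dict[str, float]) -> Dict[str, float]:
    """Return the percentage of known time spent in each non-UNKNOWN state.

    `durations` maps an availability type to the total time spent in it.
    UNKNOWN time is excluded from both the numerators and the denominator.
    """
    known = {state: t for state, t in durations.items() if state != "UNKNOWN"}
    total_known = sum(known.values())
    if total_known == 0:
        # Nothing but UNKNOWN data: no percentage can be reported.
        return {state: 0.0 for state in known}
    return {state: 100.0 * t / total_known for state, t in known.items()}


totals = {"UP": 10800.0, "DOWN": 1800.0, "DISABLED": 1800.0, "UNKNOWN": 900.0}
percentages = availability_percentages(totals)
# percentages == {"UP": 75.0, "DOWN": 12.5, "DISABLED": 12.5}
```

Under this assumption the three reported percentages always add up to 100, regardless of how much UNKNOWN time was recorded.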